{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "---\n",
    "description: Automatic Evaluation with Pre-implemented Metrics\n",
    "---\n",
    "\n",
    "# Evaluation\n",
    "\n",
    "Welcome to the \"_Evaluation_\" tutorial of the \"_From Zero to Hero_\" series. In this part we will present the functionalities offered by the `evaluation` module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!pip install avalanche-lib==0.5"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 📈 The Evaluation Module\n",
    "\n",
    "\n",
    "\n",
    "The `evaluation` module is quite straightforward: it offers all the basic functionalities to evaluate and keep track of a continual learning experiment.\n",
    "\n",
    "This is mostly done through the **Metrics**: a set of classes which implement the main continual learning metrics computation like A_ccuracy_, F_orgetting_, M_emory Usage_, R_unning Times_, etc. At the moment, in _Avalanche_ we offer a number of pre-implemented metrics you can use for your own experiments. We made sure to include all the major accuracy-based metrics but also the ones related to computation and memory.\n",
    "\n",
    "Each metric comes with a standalone class and a set of plugin classes aimed at emitting metric values on specific moments during training and evaluation.\n",
    "\n",
    "#### Standalone metric\n",
    "\n",
    "As an example, the standalone `Accuracy` class can be used to monitor the average accuracy over a stream of `<input,target>` pairs. The class provides an `update` method to update the current average accuracy, a `result` method to print the current average accuracy and a `reset` method to set the current average accuracy to zero. The call to `result`does not change the metric state.  \n",
    "\n",
    "The `TaskAwareAccuracy` metric keeps separate accuracy counters for different task labels. As such, it requires the `task_labels` parameter, which specifies which task is associated with the current patterns. The metric returns a dictionary mapping task labels to accuracy values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Initial Accuracy:  0.0\n",
      "Average Accuracy:  0.5\n",
      "Average Accuracy:  0.75\n",
      "After reset:  0.0\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from avalanche.evaluation.metrics import Accuracy, TaskAwareAccuracy\n",
    "\n",
    "# create an instance of the standalone Accuracy metric\n",
    "# initial accuracy is 0\n",
    "acc_metric = Accuracy()\n",
    "print(\"Initial Accuracy: \", acc_metric.result()) #  output 0.0\n",
    "\n",
    "# two consecutive metric updates\n",
    "real_y = torch.tensor([1, 2]).long()\n",
    "predicted_y = torch.tensor([1, 0]).float()\n",
    "acc_metric.update(real_y, predicted_y)\n",
    "acc = acc_metric.result()\n",
    "print(\"Average Accuracy: \", acc) # output 0.5\n",
    "predicted_y = torch.tensor([1,2]).float()\n",
    "acc_metric.update(real_y, predicted_y)\n",
    "acc = acc_metric.result()\n",
    "print(\"Average Accuracy: \", acc) # output 0.75\n",
    "\n",
    "# reset accuracy\n",
    "acc_metric.reset()\n",
    "print(\"After reset: \", acc_metric.result()) # output 0.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Initial Accuracy:  {}\n",
      "Average Accuracy:  {0: 0.5}\n",
      "Average Accuracy:  {0: 0.5, 1: 1.0}\n",
      "Average Accuracy:  {0: 0.75, 1: 1.0}\n",
      "After reset:  {}\n"
     ]
    }
   ],
   "source": [
    "# create an instance of the standalone TaskAwareAccuracy metric\n",
    "# initial accuracy is 0 for each task\n",
    "acc_metric = TaskAwareAccuracy()\n",
    "print(\"Initial Accuracy: \", acc_metric.result()) #  output {}\n",
    "\n",
    "# metric updates for 2 different tasks\n",
    "task_label = 0\n",
    "real_y = torch.tensor([1, 2]).long()\n",
    "predicted_y = torch.tensor([1, 0]).float()\n",
    "acc_metric.update(real_y, predicted_y, task_label)\n",
    "acc = acc_metric.result()\n",
    "print(\"Average Accuracy: \", acc) # output 0.5 for task 0\n",
    "\n",
    "task_label = 1\n",
    "predicted_y = torch.tensor([1,2]).float()\n",
    "acc_metric.update(real_y, predicted_y, task_label)\n",
    "acc = acc_metric.result() \n",
    "print(\"Average Accuracy: \", acc) # output 0.75 for task 0 and 1.0 for task 1\n",
    "\n",
    "task_label = 0\n",
    "predicted_y = torch.tensor([1,2]).float()\n",
    "acc_metric.update(real_y, predicted_y, task_label)\n",
    "acc = acc_metric.result()\n",
    "print(\"Average Accuracy: \", acc) # output 0.75 for task 0 and 1.0 for task 1\n",
    "\n",
    "# reset accuracy\n",
    "acc_metric.reset()\n",
    "print(\"After reset: \", acc_metric.result()) # output {}"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plugin metric\n",
    "\n",
    "If you want to integrate the available metrics automatically in the training and evaluation flow, you can use plugin metrics, like `EpochAccuracy` which logs the accuracy after each training epoch, or `ExperienceAccuracy` which logs the accuracy after each evaluation experience. Each of these metrics emits a **curve** composed by its values at different points in time \\(e.g. on different training epochs\\).  In order to simplify the use of these metrics, we provided utility functions with which you can create different plugin metrics in one shot. The results of these functions can be passed as parameters directly to the `EvaluationPlugin`\\(see below\\).\n",
    "\n",
    "{% hint style=\"info\" %}\n",
    "We recommend to use the helper functions when creating plugin metrics.\n",
    "{% endhint %}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from avalanche.evaluation.metrics import accuracy_metrics, \\\n",
    "    loss_metrics, forgetting_metrics, bwt_metrics,\\\n",
    "    confusion_matrix_metrics, cpu_usage_metrics, \\\n",
    "    disk_usage_metrics, gpu_usage_metrics, MAC_metrics, \\\n",
    "    ram_usage_metrics, timing_metrics\n",
    "\n",
    "# you may pass the result to the EvaluationPlugin\n",
    "metrics = accuracy_metrics(epoch=True, experience=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 📐Evaluation Plugin\n",
    "\n",
    "The **Evaluation Plugin** is the object in charge of configuring and controlling the evaluation procedure. This object can be passed to a Strategy as a \"special\" plugin through the evaluator attribute.\n",
    "\n",
    "The Evaluation Plugin accepts as inputs the plugin metrics you want to track. In addition, you can add one or more loggers to print the metrics in different ways \\(on file, on standard output, on Tensorboard...\\).\n",
    "\n",
    "It is also recommended to pass to the Evaluation Plugin the benchmark instance used in the experiment. This allows the plugin to check for consistency during metrics computation. For example, the Evaluation Plugin checks that the `strategy.eval` calls are performed on the same stream or sub-stream. Otherwise, same metric could refer to different portions of the stream.  \n",
    "These checks can be configured to raise errors (stopping computation) or only warnings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Starting experiment...\n",
      "-- >> Start of training phase << --\n",
      "100%|██████████| 24/24 [00:01<00:00, 14.55it/s]\n",
      "Epoch 0 ended.\n",
      "\tDiskUsage_Epoch/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tDiskUsage_MB/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tLoss_Epoch/train_phase/train_stream/Task000 = 1.2579\n",
      "\tLoss_MB/train_phase/train_stream/Task000 = 0.3772\n",
      "\tTime_Epoch/train_phase/train_stream/Task000 = 1.6089\n",
      "\tTop1_Acc_Epoch/train_phase/train_stream/Task000 = 0.6578\n",
      "\tTop1_Acc_MB/train_phase/train_stream/Task000 = 0.9482\n",
      "-- >> End of training phase << --\n",
      "Training completed\n",
      "Computing accuracy on the whole test set\n",
      "-- >> Start of eval phase << --\n",
      "-- Starting eval on experience 0 (Task 0) from test stream --\n",
      "100%|██████████| 20/20 [00:01<00:00, 17.96it/s]\n",
      "> Eval on experience 0 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 102.9163\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp000 = 0.3289\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp000 = 0.9678\n",
      "-- Starting eval on experience 1 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 18.00it/s]\n",
      "> Eval on experience 1 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 102.8231\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp001 = 5.2709\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp001 = 0.0000\n",
      "-- Starting eval on experience 2 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 18.59it/s]\n",
      "> Eval on experience 2 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 102.9969\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp002 = 4.4716\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp002 = 0.0000\n",
      "-- Starting eval on experience 3 (Task 0) from test stream --\n",
      "100%|██████████| 22/22 [00:01<00:00, 18.48it/s]\n",
      "> Eval on experience 3 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 102.9186\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp003 = 4.4586\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp003 = 0.0000\n",
      "-- Starting eval on experience 4 (Task 0) from test stream --\n",
      "100%|██████████| 19/19 [00:01<00:00, 18.68it/s]\n",
      "> Eval on experience 4 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 102.9407\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp004 = 4.6726\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp004 = 0.0000\n",
      "-- >> End of eval phase << --\n",
      "\tConfusionMatrix_Stream/eval_phase/test_stream = \n",
      "tensor([[   0,    0,    0,    0,  104,    0,    0,    0,  876,    0],\n",
      "        [   0,    0,    0,    0,    4,    0,    0,    0, 1131,    0],\n",
      "        [   0,    0,    0,    0,  207,    0,    0,    0,  825,    0],\n",
      "        [   0,    0,    0,    0,   41,    0,    0,    0,  969,    0],\n",
      "        [   0,    0,    0,    0,  945,    0,    0,    0,   37,    0],\n",
      "        [   0,    0,    0,    0,   97,    0,    0,    0,  795,    0],\n",
      "        [   0,    0,    0,    0,  522,    0,    0,    0,  436,    0],\n",
      "        [   0,    0,    0,    0,  662,    0,    0,    0,  366,    0],\n",
      "        [   0,    0,    0,    0,   26,    0,    0,    0,  948,    0],\n",
      "        [   0,    0,    0,    0,  848,    0,    0,    0,  161,    0]])\n",
      "\tDiskUsage_Stream/eval_phase/test_stream/Task000 = 547942.4619\n",
      "\tLoss_Stream/eval_phase/test_stream/Task000 = 3.8588\n",
      "\tStreamForgetting/eval_phase/test_stream = 0.0000\n",
      "\tTop1_Acc_Stream/eval_phase/test_stream/Task000 = 0.1893\n",
      "-- >> Start of training phase << --\n",
      "100%|██████████| 24/24 [00:01<00:00, 15.08it/s]\n",
      "Epoch 0 ended.\n",
      "\tDiskUsage_Epoch/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tDiskUsage_MB/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tLoss_Epoch/train_phase/train_stream/Task000 = 2.1896\n",
      "\tLoss_MB/train_phase/train_stream/Task000 = 0.3944\n",
      "\tTime_Epoch/train_phase/train_stream/Task000 = 1.5381\n",
      "\tTop1_Acc_Epoch/train_phase/train_stream/Task000 = 0.4541\n",
      "\tTop1_Acc_MB/train_phase/train_stream/Task000 = 0.9533\n",
      "-- >> End of training phase << --\n",
      "Training completed\n",
      "Computing accuracy on the whole test set\n",
      "-- >> Start of eval phase << --\n",
      "-- Starting eval on experience 0 (Task 0) from test stream --\n",
      "100%|██████████| 20/20 [00:01<00:00, 19.00it/s]\n",
      "> Eval on experience 0 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 103.0742\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp000 = 0.9678\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp000 = 3.4295\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp000 = 0.0000\n",
      "-- Starting eval on experience 1 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 17.08it/s]\n",
      "> Eval on experience 1 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 102.6196\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp001 = 0.3179\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp001 = 0.9618\n",
      "-- Starting eval on experience 2 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 18.42it/s]\n",
      "> Eval on experience 2 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 102.7771\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp002 = 4.2860\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp002 = 0.0000\n",
      "-- Starting eval on experience 3 (Task 0) from test stream --\n",
      "100%|██████████| 22/22 [00:01<00:00, 19.16it/s]\n",
      "> Eval on experience 3 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 103.1952\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp003 = 4.3263\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp003 = 0.0000\n",
      "-- Starting eval on experience 4 (Task 0) from test stream --\n",
      "100%|██████████| 19/19 [00:01<00:00, 17.81it/s]\n",
      "> Eval on experience 4 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 102.7549\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp004 = 4.5874\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp004 = 0.0000\n",
      "-- >> End of eval phase << --\n",
      "\tConfusionMatrix_Stream/eval_phase/test_stream = \n",
      "tensor([[  0,   0, 812,   0,   0,   0,   0,   0,   0, 168],\n",
      "        [  0,   0, 792,   0,   0,   0,   0,   0,   0, 343],\n",
      "        [  0,   0, 987,   0,   0,   0,   0,   0,   0,  45],\n",
      "        [  0,   0, 675,   0,   0,   0,   0,   0,   0, 335],\n",
      "        [  0,   0,  51,   0,   0,   0,   0,   0,   0, 931],\n",
      "        [  0,   0, 336,   0,   0,   0,   0,   0,   0, 556],\n",
      "        [  0,   0, 800,   0,   0,   0,   0,   0,   0, 158],\n",
      "        [  0,   0,  83,   0,   0,   0,   0,   0,   0, 945],\n",
      "        [  0,   0, 424,   0,   0,   0,   0,   0,   0, 550],\n",
      "        [  0,   0,  33,   0,   0,   0,   0,   0,   0, 976]])\n",
      "\tDiskUsage_Stream/eval_phase/test_stream/Task000 = 547942.4619\n",
      "\tLoss_Stream/eval_phase/test_stream/Task000 = 3.3730\n",
      "\tStreamForgetting/eval_phase/test_stream = 0.9678\n",
      "\tTop1_Acc_Stream/eval_phase/test_stream/Task000 = 0.1963\n",
      "-- >> Start of training phase << --\n",
      "100%|██████████| 25/25 [00:01<00:00, 14.15it/s]\n",
      "Epoch 0 ended.\n",
      "\tDiskUsage_Epoch/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tDiskUsage_MB/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tLoss_Epoch/train_phase/train_stream/Task000 = 1.9514\n",
      "\tLoss_MB/train_phase/train_stream/Task000 = 0.4204\n",
      "\tTime_Epoch/train_phase/train_stream/Task000 = 1.7145\n",
      "\tTop1_Acc_Epoch/train_phase/train_stream/Task000 = 0.4488\n",
      "\tTop1_Acc_MB/train_phase/train_stream/Task000 = 0.9681\n",
      "-- >> End of training phase << --\n",
      "Training completed\n",
      "Computing accuracy on the whole test set\n",
      "-- >> Start of eval phase << --\n",
      "-- Starting eval on experience 0 (Task 0) from test stream --\n",
      "100%|██████████| 20/20 [00:01<00:00, 17.29it/s]\n",
      "> Eval on experience 0 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 102.6725\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp000 = 0.9678\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp000 = 3.2081\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp000 = 0.0000\n",
      "-- Starting eval on experience 1 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 17.64it/s]\n",
      "> Eval on experience 1 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 102.8192\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp001 = 0.8030\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp001 = 1.8465\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp001 = 0.1587\n",
      "-- Starting eval on experience 2 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 19.17it/s]\n",
      "> Eval on experience 2 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 102.8407\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp002 = 0.3274\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp002 = 0.9831\n",
      "-- Starting eval on experience 3 (Task 0) from test stream --\n",
      "100%|██████████| 22/22 [00:01<00:00, 18.39it/s]\n",
      "> Eval on experience 3 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 102.8803\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp003 = 3.6570\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp003 = 0.0000\n",
      "-- Starting eval on experience 4 (Task 0) from test stream --\n",
      "100%|██████████| 19/19 [00:01<00:00, 17.92it/s]\n",
      "> Eval on experience 4 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 102.7332\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp004 = 3.9971\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp004 = 0.0000\n",
      "-- >> End of eval phase << --\n",
      "\tConfusionMatrix_Stream/eval_phase/test_stream = \n",
      "tensor([[ 973,    0,    0,    0,    0,    0,    0,    7,    0,    0],\n",
      "        [  11,    0,  185,    0,    0,    0,    0,  939,    0,    0],\n",
      "        [ 547,    0,  323,    0,    0,    0,    0,  161,    0,    1],\n",
      "        [ 690,    0,    8,    0,    0,    0,    0,  312,    0,    0],\n",
      "        [  96,    0,    1,    0,    0,    0,    0,  881,    0,    4],\n",
      "        [ 639,    0,    1,    0,    0,    0,    0,  252,    0,    0],\n",
      "        [ 741,    0,   18,    0,    0,    0,    0,  199,    0,    0],\n",
      "        [  23,    0,    4,    0,    0,    0,    0, 1001,    0,    0],\n",
      "        [ 477,    0,   12,    0,    0,    0,    0,  485,    0,    0],\n",
      "        [  66,    0,    0,    0,    0,    0,    0,  942,    0,    1]])\n",
      "\tDiskUsage_Stream/eval_phase/test_stream/Task000 = 547942.4619\n",
      "\tLoss_Stream/eval_phase/test_stream/Task000 = 2.5940\n",
      "\tStreamForgetting/eval_phase/test_stream = 0.8854\n",
      "\tTop1_Acc_Stream/eval_phase/test_stream/Task000 = 0.2298\n",
      "-- >> Start of training phase << --\n",
      "100%|██████████| 26/26 [00:01<00:00, 14.37it/s]\n",
      "Epoch 0 ended.\n",
      "\tDiskUsage_Epoch/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tDiskUsage_MB/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tLoss_Epoch/train_phase/train_stream/Task000 = 2.0999\n",
      "\tLoss_MB/train_phase/train_stream/Task000 = 0.6374\n",
      "\tTime_Epoch/train_phase/train_stream/Task000 = 1.7569\n",
      "\tTop1_Acc_Epoch/train_phase/train_stream/Task000 = 0.3801\n",
      "\tTop1_Acc_MB/train_phase/train_stream/Task000 = 0.9678\n",
      "-- >> End of training phase << --\n",
      "Training completed\n",
      "Computing accuracy on the whole test set\n",
      "-- >> Start of eval phase << --\n",
      "-- Starting eval on experience 0 (Task 0) from test stream --\n",
      "100%|██████████| 20/20 [00:01<00:00, 18.18it/s]\n",
      "> Eval on experience 0 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 102.9050\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp000 = 0.9678\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp000 = 3.1870\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp000 = 0.0000\n",
      "-- Starting eval on experience 1 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 18.19it/s]\n",
      "> Eval on experience 1 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 102.8548\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp001 = 0.9564\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp001 = 2.6842\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp001 = 0.0054\n",
      "-- Starting eval on experience 2 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 17.99it/s]\n",
      "> Eval on experience 2 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 102.8182\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp002 = 0.2186\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp002 = 0.8319\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp002 = 0.7644\n",
      "-- Starting eval on experience 3 (Task 0) from test stream --\n",
      "100%|██████████| 22/22 [00:01<00:00, 17.42it/s]\n",
      "> Eval on experience 3 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 102.7304\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp003 = 0.5317\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp003 = 0.9772\n",
      "-- Starting eval on experience 4 (Task 0) from test stream --\n",
      "100%|██████████| 19/19 [00:01<00:00, 17.69it/s]\n",
      "> Eval on experience 4 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 102.8340\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp004 = 3.8580\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp004 = 0.0000\n",
      "-- >> End of eval phase << --\n",
      "\tConfusionMatrix_Stream/eval_phase/test_stream = \n",
      "tensor([[ 727,    1,    0,  252,    0,    0,    0,    0,    0,    0],\n",
      "        [   0, 1127,    0,    8,    0,    0,    0,    0,    0,    0],\n",
      "        [  80,  236,   11,  681,    0,    0,    0,   24,    0,    0],\n",
      "        [   0,   33,    0,  969,    0,    0,    0,    8,    0,    0],\n",
      "        [  44,  149,    0,  256,    0,    0,    0,  533,    0,    0],\n",
      "        [  10,  144,    0,  719,    0,    0,    0,   19,    0,    0],\n",
      "        [ 246,  169,    1,  517,    0,    0,    0,   25,    0,    0],\n",
      "        [   2,  119,    0,   99,    0,    0,    0,  808,    0,    0],\n",
      "        [   2,  189,    0,  774,    0,    0,    0,    9,    0,    0],\n",
      "        [  23,   81,    0,  429,    0,    0,    0,  476,    0,    0]])\n",
      "\tDiskUsage_Stream/eval_phase/test_stream/Task000 = 547942.4619\n",
      "\tLoss_Stream/eval_phase/test_stream/Task000 = 2.1660\n",
      "\tStreamForgetting/eval_phase/test_stream = 0.7143\n",
      "\tTop1_Acc_Stream/eval_phase/test_stream/Task000 = 0.3642\n",
      "-- >> Start of training phase << --\n",
      "100%|██████████| 23/23 [00:01<00:00, 14.53it/s]\n",
      "Epoch 0 ended.\n",
      "\tDiskUsage_Epoch/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tDiskUsage_MB/train_phase/train_stream/Task000 = 547942.4619\n",
      "\tLoss_Epoch/train_phase/train_stream/Task000 = 2.5584\n",
      "\tLoss_MB/train_phase/train_stream/Task000 = 1.2959\n",
      "\tTime_Epoch/train_phase/train_stream/Task000 = 1.5327\n",
      "\tTop1_Acc_Epoch/train_phase/train_stream/Task000 = 0.1822\n",
      "\tTop1_Acc_MB/train_phase/train_stream/Task000 = 0.7139\n",
      "-- >> End of training phase << --\n",
      "Training completed\n",
      "Computing accuracy on the whole test set\n",
      "-- >> Start of eval phase << --\n",
      "-- Starting eval on experience 0 (Task 0) from test stream --\n",
      "100%|██████████| 20/20 [00:01<00:00, 18.30it/s]\n",
      "> Eval on experience 0 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 102.9717\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp000 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp000 = 0.9678\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp000 = 2.7437\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp000 = 0.0000\n",
      "-- Starting eval on experience 1 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 18.54it/s]\n",
      "> Eval on experience 1 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 103.2760\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp001 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp001 = 0.9613\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp001 = 2.4918\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp001 = 0.0005\n",
      "-- Starting eval on experience 2 (Task 0) from test stream --\n",
      "100%|██████████| 21/21 [00:01<00:00, 18.45it/s]\n",
      "> Eval on experience 2 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 102.8688\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp002 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp002 = 0.3352\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp002 = 1.4392\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp002 = 0.6479\n",
      "-- Starting eval on experience 3 (Task 0) from test stream --\n",
      "100%|██████████| 22/22 [00:01<00:00, 17.58it/s]\n",
      "> Eval on experience 3 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 102.6826\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp003 = 547942.4619\n",
      "\tExperienceForgetting/eval_phase/test_stream/Task000/Exp003 = 0.2033\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp003 = 0.9682\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp003 = 0.7739\n",
      "-- Starting eval on experience 4 (Task 0) from test stream --\n",
      "100%|██████████| 19/19 [00:01<00:00, 17.53it/s]\n",
      "> Eval on experience 4 (Task 0) from test stream ended.\n",
      "\tCPUUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 102.6702\n",
      "\tDiskUsage_Exp/eval_phase/test_stream/Task000/Exp004 = 547942.4619\n",
      "\tLoss_Exp/eval_phase/test_stream/Task000/Exp004 = 1.1828\n",
      "\tTop1_Acc_Exp/eval_phase/test_stream/Task000/Exp004 = 0.7989\n",
      "-- >> End of eval phase << --\n",
      "\tConfusionMatrix_Stream/eval_phase/test_stream = \n",
      "tensor([[ 537,    3,    0,   91,    0,   52,  296,    1,    0,    0],\n",
      "        [   0, 1125,    0,    1,    0,    1,    8,    0,    0,    0],\n",
      "        [   1,  253,    1,   50,    0,    4,  714,    9,    0,    0],\n",
      "        [   0,  108,    0,  535,    0,  231,  133,    3,    0,    0],\n",
      "        [   4,  116,    0,   17,    0,   52,  469,  300,    0,   24],\n",
      "        [   0,  103,    0,   40,    0,  547,  193,    9,    0,    0],\n",
      "        [   2,   19,    0,    2,    0,    3,  931,    1,    0,    0],\n",
      "        [   1,  189,    0,   13,    0,   13,   48,  764,    0,    0],\n",
      "        [   0,  272,    0,   35,    0,  319,  344,    4,    0,    0],\n",
      "        [   4,  117,    0,   16,    0,  195,  351,  326,    0,    0]])\n",
      "\tDiskUsage_Stream/eval_phase/test_stream/Task000 = 547942.4619\n",
      "\tLoss_Stream/eval_phase/test_stream/Task000 = 1.7608\n",
      "\tStreamForgetting/eval_phase/test_stream = 0.6169\n",
      "\tTop1_Acc_Stream/eval_phase/test_stream/Task000 = 0.4440\n"
     ]
    }
   ],
   "source": [
    "from torch.nn import CrossEntropyLoss\n",
    "from torch.optim import SGD\n",
    "from avalanche.benchmarks.classic import SplitMNIST\n",
    "from avalanche.evaluation.metrics import forgetting_metrics, \\\n",
    "accuracy_metrics, loss_metrics, timing_metrics, cpu_usage_metrics, \\\n",
    "confusion_matrix_metrics, disk_usage_metrics\n",
    "from avalanche.models import SimpleMLP\n",
    "from avalanche.logging import InteractiveLogger\n",
    "from avalanche.training.plugins import EvaluationPlugin\n",
    "from avalanche.training import Naive\n",
    "\n",
    "benchmark = SplitMNIST(n_experiences=5)\n",
    "\n",
    "# MODEL CREATION\n",
    "model = SimpleMLP(num_classes=benchmark.n_classes)\n",
    "\n",
    "# DEFINE THE EVALUATION PLUGIN\n",
    "# The evaluation plugin manages the metrics computation.\n",
    "# It takes as argument a list of metrics, collectes their results and returns\n",
    "# them to the strategy it is attached to.\n",
    "\n",
    "eval_plugin = EvaluationPlugin(\n",
    "    accuracy_metrics(minibatch=True, epoch=True, experience=True, stream=True),\n",
    "    loss_metrics(minibatch=True, epoch=True, experience=True, stream=True),\n",
    "    timing_metrics(epoch=True),\n",
    "    forgetting_metrics(experience=True, stream=True),\n",
    "    cpu_usage_metrics(experience=True),\n",
    "    confusion_matrix_metrics(num_classes=benchmark.n_classes, save_image=False, stream=True),\n",
    "    disk_usage_metrics(minibatch=True, epoch=True, experience=True, stream=True),\n",
    "    loggers=[InteractiveLogger()],\n",
    "    strict_checks=False\n",
    ")\n",
    "\n",
    "# CREATE THE STRATEGY INSTANCE (NAIVE)\n",
    "cl_strategy = Naive(\n",
    "    model, SGD(model.parameters(), lr=0.001, momentum=0.9),\n",
    "    CrossEntropyLoss(), train_mb_size=500, train_epochs=1, eval_mb_size=100,\n",
    "    evaluator=eval_plugin)\n",
    "\n",
    "# TRAINING LOOP\n",
    "print('Starting experiment...')\n",
    "results = []\n",
    "for experience in benchmark.train_stream:\n",
    "    # train returns a dictionary which contains all the metric values\n",
    "    res = cl_strategy.train(experience)\n",
    "    print('Training completed')\n",
    "\n",
    "    print('Computing accuracy on the whole test set')\n",
    "    # test also returns a dictionary which contains all the metric values\n",
    "    results.append(cl_strategy.eval(benchmark.test_stream))"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Implement your own metric\n",
    "\n",
    "To implement a **standalone metric**, you have to subclass `Metric` class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from avalanche.evaluation import Metric\n",
    "\n",
    "\n",
    "# a standalone metric implementation\n",
    "class MyStandaloneMetric(Metric[float]):\n",
    "    \"\"\"\n",
    "    This metric will return a `float` value\n",
    "    \"\"\"\n",
    "    def __init__(self):\n",
    "        \"\"\"\n",
    "        Initialize your metric here\n",
    "        \"\"\"\n",
    "        super().__init__()\n",
    "        pass\n",
    "\n",
    "    def update(self):\n",
    "        \"\"\"\n",
    "        Update metric value here\n",
    "        \"\"\"\n",
    "        pass\n",
    "\n",
    "    def result(self, **kwargs) -> float:\n",
    "        \"\"\"\n",
    "        Emit the metric result here\n",
    "        \"\"\"\n",
    "        return 0\n",
    "\n",
    "    def reset(self, **kwargs):\n",
    "        \"\"\"\n",
    "        Reset your metric here\n",
    "        \"\"\"\n",
    "        pass"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " To implement a **plugin metric** you have to subclass `PluginMetric` class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from avalanche.evaluation import PluginMetric\n",
    "from avalanche.evaluation.metrics import Accuracy\n",
    "from avalanche.evaluation.metric_results import MetricValue\n",
    "from avalanche.evaluation.metric_utils import get_metric_name\n",
    "\n",
    "\n",
    "class MyPluginMetric(PluginMetric[float]):\n",
    "    \"\"\"\n",
    "    This metric will return a `float` value after\n",
    "    each training epoch\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self):\n",
    "        \"\"\"\n",
    "        Initialize the metric\n",
    "        \"\"\"\n",
    "        super().__init__()\n",
    "\n",
    "        self._accuracy_metric = Accuracy()\n",
    "\n",
    "    def reset(self, **kwargs) -> None:\n",
    "        \"\"\"\n",
    "        Reset the metric\n",
    "        \"\"\"\n",
    "        self._accuracy_metric.reset()\n",
    "\n",
    "    def result(self, **kwargs) -> float:\n",
    "        \"\"\"\n",
    "        Emit the result\n",
    "        \"\"\"\n",
    "        return self._accuracy_metric.result()\n",
    "\n",
    "    def after_training_iteration(self, strategy: 'PluggableStrategy') -> None:\n",
    "        \"\"\"\n",
    "        Update the accuracy metric with the current\n",
    "        predictions and targets\n",
    "        \"\"\"\n",
    "        # task labels defined for each experience\n",
    "        task_labels = strategy.experience.task_labels\n",
    "        if len(task_labels) > 1:\n",
    "            # task labels defined for each pattern\n",
    "            task_labels = strategy.mb_task_id\n",
    "        else:\n",
    "            task_labels = task_labels[0]\n",
    "            \n",
    "        self._accuracy_metric.update(strategy.mb_output, strategy.mb_y, \n",
    "                                     task_labels)\n",
    "\n",
    "    def before_training_epoch(self, strategy: 'PluggableStrategy') -> None:\n",
    "        \"\"\"\n",
    "        Reset the accuracy before the epoch begins\n",
    "        \"\"\"\n",
    "        self.reset()\n",
    "\n",
    "    def after_training_epoch(self, strategy: 'PluggableStrategy'):\n",
    "        \"\"\"\n",
    "        Emit the result\n",
    "        \"\"\"\n",
    "        return self._package_result(strategy)\n",
    "        \n",
    "        \n",
    "    def _package_result(self, strategy):\n",
    "        \"\"\"Taken from `GenericPluginMetric`, check that class out!\"\"\"\n",
    "        metric_value = self.accuracy_metric.result()\n",
    "        add_exp = False\n",
    "        plot_x_position = strategy.clock.train_iterations\n",
    "\n",
    "        if isinstance(metric_value, dict):\n",
    "            metrics = []\n",
    "            for k, v in metric_value.items():\n",
    "                metric_name = get_metric_name(\n",
    "                    self, strategy, add_experience=add_exp, add_task=k)\n",
    "                metrics.append(MetricValue(self, metric_name, v,\n",
    "                                           plot_x_position))\n",
    "            return metrics\n",
    "        else:\n",
    "            metric_name = get_metric_name(self, strategy,\n",
    "                                          add_experience=add_exp,\n",
    "                                          add_task=True)\n",
    "            return [MetricValue(self, metric_name, metric_value,\n",
    "                                plot_x_position)]\n",
    "\n",
    "    def __str__(self):\n",
    "        \"\"\"\n",
    "        Here you can specify the name of your metric\n",
    "        \"\"\"\n",
    "        return \"Top1_Acc_Epoch\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Accessing metric values"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you want to access all the metrics computed during training and evaluation, you have to make sure that `collect_all=True` is set when creating the `EvaluationPlugin` (default option is `True`). This option maintains an updated version of all metric results in the plugin, which can be retrieved by calling `evaluation_plugin.get_all_metrics()`. You can call this methods whenever you need the metrics. \n",
    "\n",
    "The result is a dictionary with full metric names as keys and a tuple of two lists as values. The first list stores all the `x` values recorded for that metric. Each `x` value represents the time step at which the corresponding metric value has been computed. The second list stores metric values associated to the corresponding `x` value. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "defaultdict(<function _init_metrics_list_lambda at 0x16a258b80>, {})\n"
     ]
    }
   ],
   "source": [
    "eval_plugin2 = EvaluationPlugin(\n",
    "    accuracy_metrics(minibatch=True, epoch=True, experience=True, stream=True),\n",
    "    loss_metrics(minibatch=True, epoch=True, experience=True, stream=True),\n",
    "    forgetting_metrics(experience=True, stream=True),\n",
    "    timing_metrics(epoch=True),\n",
    "    cpu_usage_metrics(experience=True),\n",
    "    confusion_matrix_metrics(num_classes=benchmark.n_classes, save_image=False, stream=True),\n",
    "    disk_usage_metrics(minibatch=True, epoch=True, experience=True, stream=True),\n",
    "    collect_all=True, # this is default value anyway\n",
    "    loggers=[InteractiveLogger()]\n",
    ")\n",
    "\n",
    "# since no training and evaluation has been performed, this will return an empty dict.\n",
    "metric_dict = eval_plugin2.get_all_metrics()\n",
    "print(metric_dict)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([24, 48, 73, 99, 122],\n",
       " [0.6578294706234499,\n",
       "  0.4541026287058033,\n",
       "  0.44880210042664914,\n",
       "  0.3800978792822186,\n",
       "  0.18220301613898932])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d = eval_plugin.get_all_metrics()\n",
    "d['Top1_Acc_Epoch/train_phase/train_stream/Task000']"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively, the `train` and `eval` method of every `strategy` returns a dictionary storing, for each metric, the last value recorded for that metric. You can use these dictionaries to incrementally accumulate metrics. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Top1_Acc_MB/train_phase/train_stream/Task000': 0.7138643067846607, 'Loss_MB/train_phase/train_stream/Task000': 1.2959309816360474, 'DiskUsage_MB/train_phase/train_stream/Task000': 547942.4619140625, 'Top1_Acc_Epoch/train_phase/train_stream/Task000': 0.18220301613898932, 'Loss_Epoch/train_phase/train_stream/Task000': 2.5583770229383505, 'Time_Epoch/train_phase/train_stream/Task000': 1.5327221659999992, 'DiskUsage_Epoch/train_phase/train_stream/Task000': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp000': 0.0, 'Loss_Exp/eval_phase/test_stream/Task000/Exp000': 3.187013988845919, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp000': 102.9050442924926, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp000': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp001': 0.005389514943655071, 'Loss_Exp/eval_phase/test_stream/Task000/Exp001': 2.6841572535382596, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp001': 102.85476214447425, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp001': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp002': 0.7644422310756972, 'Loss_Exp/eval_phase/test_stream/Task000/Exp002': 0.8318540411760132, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp002': 102.8181578306788, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp002': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp003': 0.9771561771561772, 'Loss_Exp/eval_phase/test_stream/Task000/Exp003': 0.5317016594059818, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp003': 102.73038526110618, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp003': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp004': 0.0, 'Loss_Exp/eval_phase/test_stream/Task000/Exp004': 3.858012818001412, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp004': 102.83401924961339, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp004': 547942.4619140625, 'Top1_Acc_Stream/eval_phase/test_stream/Task000': 0.3642, 'Loss_Stream/eval_phase/test_stream/Task000': 2.1660351004064085, 'StreamForgetting/eval_phase/test_stream': 0.7142702778659012, 'ConfusionMatrix_Stream/eval_phase/test_stream': tensor([[ 727,    1,    0,  252,    0,    0,    0,    0,    0,    0],\n",
      "        [   0, 1127,    0,    8,    0,    0,    0,    0,    0,    0],\n",
      "        [  80,  236,   11,  681,    0,    0,    0,   24,    0,    0],\n",
      "        [   0,   33,    0,  969,    0,    0,    0,    8,    0,    0],\n",
      "        [  44,  149,    0,  256,    0,    0,    0,  533,    0,    0],\n",
      "        [  10,  144,    0,  719,    0,    0,    0,   19,    0,    0],\n",
      "        [ 246,  169,    1,  517,    0,    0,    0,   25,    0,    0],\n",
      "        [   2,  119,    0,   99,    0,    0,    0,  808,    0,    0],\n",
      "        [   2,  189,    0,  774,    0,    0,    0,    9,    0,    0],\n",
      "        [  23,   81,    0,  429,    0,    0,    0,  476,    0,    0]]), 'DiskUsage_Stream/eval_phase/test_stream/Task000': 547942.4619140625, 'ExperienceForgetting/eval_phase/test_stream/Task000/Exp000': 0.9677914110429447, 'ExperienceForgetting/eval_phase/test_stream/Task000/Exp001': 0.9563939245467908, 'ExperienceForgetting/eval_phase/test_stream/Task000/Exp002': 0.21862549800796816}\n"
     ]
    }
   ],
   "source": [
    "print(res)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Top1_Acc_MB/train_phase/train_stream/Task000': 0.7138643067846607, 'Loss_MB/train_phase/train_stream/Task000': 1.2959309816360474, 'DiskUsage_MB/train_phase/train_stream/Task000': 547942.4619140625, 'Top1_Acc_Epoch/train_phase/train_stream/Task000': 0.18220301613898932, 'Loss_Epoch/train_phase/train_stream/Task000': 2.5583770229383505, 'Time_Epoch/train_phase/train_stream/Task000': 1.5327221659999992, 'DiskUsage_Epoch/train_phase/train_stream/Task000': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp000': 0.0, 'Loss_Exp/eval_phase/test_stream/Task000/Exp000': 2.743731581116503, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp000': 102.97172345909856, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp000': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp001': 0.0004899559039686428, 'Loss_Exp/eval_phase/test_stream/Task000/Exp001': 2.4918276796382997, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp001': 103.27601095356307, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp001': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp002': 0.6479083665338645, 'Loss_Exp/eval_phase/test_stream/Task000/Exp002': 1.439215256873355, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp002': 102.868754724034, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp002': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp003': 0.7738927738927739, 'Loss_Exp/eval_phase/test_stream/Task000/Exp003': 0.9682179094750286, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp003': 102.6825800596294, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp003': 547942.4619140625, 'Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp004': 0.798918918918919, 'Loss_Exp/eval_phase/test_stream/Task000/Exp004': 1.18284226430429, 'CPUUsage_Exp/eval_phase/test_stream/Task000/Exp004': 102.67016110172969, 'DiskUsage_Exp/eval_phase/test_stream/Task000/Exp004': 547942.4619140625, 'Top1_Acc_Stream/eval_phase/test_stream/Task000': 0.444, 'Loss_Stream/eval_phase/test_stream/Task000': 1.760758910739422, 'StreamForgetting/eval_phase/test_stream': 0.6168769151106566, 'ConfusionMatrix_Stream/eval_phase/test_stream': tensor([[ 537,    3,    0,   91,    0,   52,  296,    1,    0,    0],\n",
      "        [   0, 1125,    0,    1,    0,    1,    8,    0,    0,    0],\n",
      "        [   1,  253,    1,   50,    0,    4,  714,    9,    0,    0],\n",
      "        [   0,  108,    0,  535,    0,  231,  133,    3,    0,    0],\n",
      "        [   4,  116,    0,   17,    0,   52,  469,  300,    0,   24],\n",
      "        [   0,  103,    0,   40,    0,  547,  193,    9,    0,    0],\n",
      "        [   2,   19,    0,    2,    0,    3,  931,    1,    0,    0],\n",
      "        [   1,  189,    0,   13,    0,   13,   48,  764,    0,    0],\n",
      "        [   0,  272,    0,   35,    0,  319,  344,    4,    0,    0],\n",
      "        [   4,  117,    0,   16,    0,  195,  351,  326,    0,    0]]), 'DiskUsage_Stream/eval_phase/test_stream/Task000': 547942.4619140625, 'ExperienceForgetting/eval_phase/test_stream/Task000/Exp000': 0.9677914110429447, 'ExperienceForgetting/eval_phase/test_stream/Task000/Exp001': 0.9612934835864773, 'ExperienceForgetting/eval_phase/test_stream/Task000/Exp002': 0.33515936254980083, 'ExperienceForgetting/eval_phase/test_stream/Task000/Exp003': 0.2032634032634033}\n"
     ]
    }
   ],
   "source": [
    "print(results[-1])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This completes the \"_Evaluation_\" tutorial for the \"_From Zero to Hero_\" series. We hope you enjoyed it!\n",
    "\n",
    "## 🤝 Run it on Google Colab\n",
    "\n",
    "You can run _this chapter_ and play with it on Google Colaboratory: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ContinualAI/avalanche/blob/master/notebooks/from-zero-to-hero-tutorial/05_evaluation.ipynb)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
